Abstract: We consider Sinai's random walk in random environment. We prove that the logarithm
of the local time is a good estimator of the random potential associated to the random environment.
We give a constructive method allowing us to build the random environment from a single trajectory
of the random walk.



1 Introduction and results

In this paper we are interested in Sinai's walk, i.e., a one-dimensional random walk in random environment with three conditions on the random environment: two hypotheses that are necessary to get a recurrent process (see Solomon (1975)) which is not a simple random walk, and a regularity hypothesis which allows us to have good control on the fluctuations of the random environment.



The asymptotic behavior of such a walk has been understood by Sinai (1982): this walk is sub-diffusive and at an instant n it is localized in the neighborhood of a well defined point of the lattice.
It is well known (see Zeitouni (2001) for a survey) that this behavior depends strongly on the
random environment or, equivalently, on the associated random potential defined in Section 1.2.

The question we solve here is the following: given a single trajectory of a random walk $(X_l, 1 \le l \le n)$ where the time n is fixed, can we estimate the trajectory of the random potential where the
walk lives? Let us remark that the law of this potential is unknown as well.



In their paper, Adelman and Enriquez (2004) are interested in the question of the distribution of the random environment that could be deduced from a single trajectory of the walk; our purpose, on the other hand, is to get an approximation of the trajectory of the random potential.
In the paper V. Baldazzi and Monasson (2006) the authors are interested in a method to predict the sequence of DNA molecules. They model the unzipping of the molecule as a one-dimensional biased
random walk for the fork position (number of open base pairs) k in this landscape. The elementary
opening ($k \to k+1$) and closing ($k \to k-1$) transitions happen with a probability that depends
on the unknown sequence. This transition probability follows an Arrhenius law which is close to
the one we discuss here. The question they answer is: given an unzipping signal, can we predict
the unzipping sequence? Their approach is based on a Bayesian inference method which gives very
good probabilities of prediction for a large amount of data. In terms of the walk, this means several
trajectories on the same environment.

Our approach is purely probabilistic: it is based on good properties of the local time of the random
walk, which is the amount of time the walk spends on the points of the lattice. We treat a general
case with very little information on the random environment. We are able to reconstruct the random
potential in a significant interval where the walk spends most of its time. Our proof is based on the
results of Andreoletti (2005), in particular on a weak law of large numbers for the local time at the
point of localization of the walk.

The largest part of this paper is devoted to the proof of a theoretical result (Theorem 1.7). We also
present, at the end of the document, numerical simulations to illustrate our result. We give the main
steps of the algorithm we use to rebuild the random potential only by considering a trajectory of the
walk. As an introduction we would like to comment on one of these simulations:







Figure 1: The logarithm of the local time (in blue) and the random potential (in red) 



In blue we have represented the logarithm of the local time and in red the potential associated to
the random environment. First, remark that we get a good approximation on a large neighborhood
of the bottom of the valley around the coordinate -80. Outside this neighborhood, and especially after
the coordinate -20, the approximation is not precise at all. We will explain this phenomenon by the
fact that after the walk has reached the bottom of the valley, it will not return frequently to
the points with coordinate larger than -20, so we lose information for this part of the lattice.

Our method of estimation gives us two crucial pieces of information: a confidence interval for the differences
of potential in sup-norm, on an observable set of sites "sufficiently" visited by the walk, and a localization result for the bottom of the valley linked with the hitting time of the maximum of the local
times. First we need to define the process:

1.1 Definition of Sinai's walk 

Let $\alpha = (\alpha_i, i \in \mathbb{Z})$ be a sequence of i.i.d. random variables taking values in (0,1) defined on the
probability space $(\Omega_1, \mathcal{F}_1, Q)$; this sequence will be called the random environment. A random walk in
random environment (denoted R.W.R.E.) $(X_n, n \in \mathbb{N})$ is a sequence of random variables taking values
in $\mathbb{Z}$, defined on $(\Omega, \mathcal{F}, \mathbb{P})$ such that

• for every fixed environment $\alpha$, $(X_n, n \in \mathbb{N})$ is a Markov chain with the following transition probabilities, for all $n \ge 1$ and $i \in \mathbb{Z}$

$P^{\alpha}[X_n = i+1 \mid X_{n-1} = i] = \alpha_i$, (1.1)
$P^{\alpha}[X_n = i-1 \mid X_{n-1} = i] = 1 - \alpha_i \equiv \beta_i$.

We denote $(\Omega_2, \mathcal{F}_2, P^{\alpha})$ the probability space associated to this Markov chain.

• $\Omega = \Omega_1 \times \Omega_2$, $\forall A_1 \in \mathcal{F}_1$ and $\forall A_2 \in \mathcal{F}_2$, $\mathbb{P}[A_1 \times A_2] = \int_{A_1} Q(d\omega_1) \int_{A_2} P^{\alpha(\omega_1)}(d\omega_2)$.

The probability measure $P^{\alpha}[\,\cdot \mid X_0 = a]$ will be denoted $P^{\alpha}_a[\,\cdot\,]$, the expectation associated to $P^{\alpha}_a$: $E^{\alpha}_a$,
and the expectation associated to Q: $E_Q$.

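Now we introduce the hypotheses we will use in all this work. The two following hypotheses are the
necessary hypotheses

$E_Q\left[\log \frac{1-\alpha_0}{\alpha_0}\right] = 0$, (1.2)

$\mathrm{Var}_Q\left[\log \frac{1-\alpha_0}{\alpha_0}\right] \equiv \sigma^2 > 0$. (1.3)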

Solomon (1975) shows that under 1.2 the process $(X_n, n \in \mathbb{N})$ is $\mathbb{P}$ almost surely recurrent, and 1.3
implies that the model is not reduced to the simple random walk. In addition to 1.2 and 1.3 we will
consider the following hypothesis of regularity: there exists $0 < \eta_0 < 1/2$ such that

$\sup\{x,\ Q[\alpha_0 \ge x] = 1\} = \sup\{x,\ Q[\alpha_0 \le 1 - x] = 1\} \ge \eta_0$. (1.4)

We call Sinai's random walk the random walk in random environment previously defined with the
three hypotheses 1.2, 1.3 and 1.4.

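Let us define the local time $\mathcal{L}$ at k ($k \in \mathbb{Z}$) within the interval of time [1, T] ($T \in \mathbb{N}^*$) of $(X_n, n \in \mathbb{N})$

$\mathcal{L}(k,T) \equiv \sum_{i=1}^{T} I_{\{X_i = k\}}$. (1.5)

I is the indicator function (k and T can be deterministic or random variables). Let $V \subset \mathbb{Z}$, we denote

$\mathcal{L}(V,T) \equiv \sum_{j \in V} \mathcal{L}(j,T) = \sum_{i=1}^{T} \sum_{j \in V} I_{\{X_i = j\}}$. (1.6)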

To end, we define the following random variables

$\mathcal{L}^*(n) = \max_{k \in \mathbb{Z}} (\mathcal{L}(k,n))$, $\mathbb{F}_n = \{k \in \mathbb{Z},\ \mathcal{L}(k,n) = \mathcal{L}^*(n)\}$, (1.7)
$k^* = \inf\{|k|,\ k \in \mathbb{F}_n\}$. (1.8)

$\mathcal{L}^*(n)$ is the maximum of the local times (for a given instant n), $\mathbb{F}_n$ is the set of all the favorite sites
and $k^*$ the smallest favorite site.
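The definitions 1.5, 1.7 and 1.8 translate directly into a few lines of code. The sketch below is only an illustration: the function names are hypothetical, the trajectory is assumed to be stored as a list `path` with `path[i]` equal to $X_i$ (so `path[0]` is the starting point and is not counted), and the tie-break between two favorite sites of equal absolute value is an assumption that the text does not specify.

```python
from collections import Counter


def local_times(path):
    """Local times L(k, n) for n = len(path) - 1: visits to k at times 1..n."""
    # X_0 = path[0] is excluded because the sum in (1.5) starts at i = 1.
    return Counter(path[1:])


def favourite_sites(path):
    """Return (L*(n), F_n): the maximal local time and the sorted favorite sites."""
    lt = local_times(path)
    l_star = max(lt.values())
    f_n = sorted(k for k, v in lt.items() if v == l_star)
    return l_star, f_n


def smallest_favourite(f_n):
    """Favorite site of smallest absolute value (ties broken towards the
    negative side; this tie-break is an assumption of the sketch)."""
    return min(f_n, key=lambda k: (abs(k), k))
```

For instance, for the short path 0, 1, 0, 1, 2, 1, 0, -1, 0 the sites 0 and 1 are both visited three times after time 0, so both are favorite sites and the smallest one is 0.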
1.2 The random potential and the valleys

From the random environment we define what we will call the random potential.
Let

$\epsilon_i \equiv \log \frac{1 - \alpha_i}{\alpha_i}$, $i \in \mathbb{Z}$, (1.9)

define:

Definition 1.1. The random potential $(S_m, m \in \mathbb{Z})$ associated to the random environment $\alpha$ is
defined in the following way:

$S_k = \sum_{1 \le i \le k} \epsilon_i$, if $k > 0$,
$S_k = -\sum_{k+1 \le i \le 0} \epsilon_i$, if $k < 0$,
$S_0 = 0$.




Figure 2: Trajectory of the random potential 
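To make Definition 1.1 and the transition probabilities 1.1 concrete, here is a minimal simulation sketch. It is not the code used for the figures of this paper: the function names, the seed and the choice of a uniform law on $[\eta, 1-\eta]$ for the $\alpha_i$ are assumptions made for illustration. That law is symmetric around 1/2, so it satisfies 1.2, 1.3 and 1.4 with $\eta_0 = \eta$.

```python
import math
import random


def sample_environment(lo, hi, rng, eta=0.2):
    """Draw i.i.d. alpha_i, uniform on [eta, 1 - eta], for lo <= i <= hi."""
    return {i: rng.uniform(eta, 1.0 - eta) for i in range(lo, hi + 1)}


def potential(alpha, lo, hi):
    """Random potential S_k of Definition 1.1 for lo <= k <= hi (lo <= 0 <= hi)."""
    eps = {i: math.log((1.0 - a) / a) for i, a in alpha.items()}
    S = {0: 0.0}
    for k in range(1, hi + 1):          # S_k = eps_1 + ... + eps_k for k > 0
        S[k] = S[k - 1] + eps[k]
    for k in range(-1, lo - 1, -1):     # S_k = -(eps_{k+1} + ... + eps_0) for k < 0
        S[k] = S[k + 1] - eps[k + 1]
    return S


def simulate_walk(alpha, n, rng):
    """n steps of the walk started at 0: step +1 with probability alpha_x."""
    x, path = 0, [0]
    for _ in range(n):
        x += 1 if rng.random() < alpha[x] else -1
        path.append(x)
    return path


rng = random.Random(0)                  # fixed seed, for reproducibility only
n = 1000
alpha = sample_environment(-n, n, rng)  # the walk cannot leave [-n, n] in n steps
S = potential(alpha, -n, n)
path = simulate_walk(alpha, n, rng)
```

The environment is sampled on $[-n, n]$ because a nearest-neighbour walk started at 0 cannot leave this interval within n steps.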



Definition 1.2. We will say that the triplet $\{M', m, M''\}$ is a valley if

$S_{M'} = \max_{M' \le t \le m} S_t$, (1.10)

$S_{M''} = \max_{m \le t \le M''} S_t$, (1.11)

$S_m = \min_{M' \le t \le M''} S_t$. (1.12)

If m is not unique we choose the one with the smallest absolute value.

Definition 1.3. We will call depth of the valley $\{M', m, M''\}$, and we will denote it $d([M', M''])$, the
quantity

$\min(S_{M'} - S_m,\ S_{M''} - S_m)$. (1.13)
Now we define the operation of refinement

Definition 1.4. Let $\{M', m, M''\}$ be a valley and let $M_1$ and $m_1$ be such that $m \le M_1 < m_1 \le M''$
and

$S_{M_1} - S_{m_1} = \max_{m \le t' \le t'' \le M''} (S_{t'} - S_{t''})$. (1.14)

We say that the couple $(m_1, M_1)$ is obtained by a right refinement of $\{M', m, M''\}$. If the couple
$(m_1, M_1)$ is not unique, we will take the one such that $m_1$ and $M_1$ have the smallest absolute value.
In a similar way we define the left refinement operation.




Figure 3: Depth of a valley and refinement operation 
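The depth 1.13 and the right refinement 1.14 can be computed in a single left-to-right scan of the potential. The sketch below is an illustration under stated assumptions: the names are hypothetical, `S` is any integer-indexed container of potential values, and when several couples achieve the maximum the first one met in the scan is returned, which is not necessarily the tie-break of Definition 1.4.

```python
def depth(S, M1, m, M2):
    """Depth of the valley {M', m, M''} as in (1.13)."""
    return min(S[M1] - S[m], S[M2] - S[m])


def right_refinement(S, m, M2):
    """Largest drop S[t'] - S[t''] over m <= t' < t'' <= M2, as in (1.14).

    Returns (M_1, m_1, drop). Ties are resolved by keeping the first couple
    found while scanning to the right (an assumption of this sketch).
    """
    peak = m                            # argmax of S over [m, t - 1]
    best_drop, M_1, m_1 = float("-inf"), None, None
    for t in range(m + 1, M2 + 1):
        if S[t - 1] > S[peak]:
            peak = t - 1
        drop = S[peak] - S[t]
        if drop > best_drop:
            best_drop, M_1, m_1 = drop, peak, t
    return M_1, m_1, best_drop
```

The left refinement is obtained in the same way by scanning from m towards $M'$.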



We denote $\log_2 = \log \log$; in all this section we will suppose that n is large enough such that $\log_2 n$ is
positive.

Definition 1.5. Let $n > 3$, $\gamma > 0$, and $\Gamma_n \equiv \log n + \gamma \log_2 n$, we say that a valley $\{M', m, M''\}$
contains 0 and is of depth larger than $\Gamma_n$ if and only if

1. $0 \in [M', M'']$,

2. $d([M', M'']) \ge \Gamma_n$,

3. if $m < 0$, $S_{M''} - \max_{m \le t \le 0}(S_t) \ge \gamma \log_2 n$,
if $m > 0$, $S_{M'} - \max_{0 \le t \le m}(S_t) \ge \gamma \log_2 n$.

The basic valley $\{M_n', m_n, M_n\}$

We recall the notion of basic valley introduced by Sinai and denoted here $\{M_n', m_n, M_n\}$. The
definition we give is inspired by the work of Kesten (1986). First let $\{M', m_n, M''\}$ be the smallest
valley that contains 0 and is of depth larger than $\Gamma_n$. Here smallest means that if we construct, with
the operation of refinement, other valleys in $\{M', m_n, M''\}$, such valleys will not satisfy one of the
properties of Definition 1.5. $M_n'$ and $M_n$ are defined from $m_n$ in the following way: if $m_n > 0$

$M_n' = \sup\{l \in \mathbb{Z}_-,\ l < m_n,\ S_l - S_{m_n} \ge \Gamma_n,\ S_l - \max_{0 \le k \le m_n} S_k \ge \gamma \log_2 n\}$, (1.15)

$M_n = \inf\{l \in \mathbb{Z}_+,\ l > m_n,\ S_l - S_{m_n} \ge \Gamma_n\}$. (1.16)

if $m_n < 0$

$M_n' = \sup\{l \in \mathbb{Z}_-,\ l < m_n,\ S_l - S_{m_n} \ge \Gamma_n\}$, (1.17)

$M_n = \inf\{l \in \mathbb{Z}_+,\ l > m_n,\ S_l - S_{m_n} \ge \Gamma_n,\ S_l - \max_{m_n \le k \le 0} S_k \ge \gamma \log_2 n\}$. (1.18)

if $m_n = 0$

$M_n' = \sup\{l \in \mathbb{Z}_-,\ l < 0,\ S_l - S_{m_n} \ge \Gamma_n\}$, (1.19)

$M_n = \inf\{l \in \mathbb{Z}_+,\ l > 0,\ S_l - S_{m_n} \ge \Gamma_n\}$. (1.20)

$\{M_n', m_n, M_n\}$ exists with a Q probability as close to one as we need. In fact it is not difficult to
prove the following lemma.

Figure 4: Basic valley, case $m_n > 0$

Lemma 1.6. There exists $c > 0$ such that if 1.2, 1.3 and 1.4 hold, for all $\gamma > 0$ and n we have

$Q[\{M_n', m_n, M_n\} \ne \emptyset] = 1 - \frac{c \gamma \log_2 n}{\log n}$. (1.21)

Proof.

One can find the proof of this Lemma in Section 5.2 of Andreoletti (2006). ■
1.3 Main results 

We start with some definitions that will be used all along this work. Let $x \in \mathbb{Z}$, define

$T_x = \inf\{k \in \mathbb{N}^*,\ X_k = x\}$, and $T_x = +\infty$ if such k does not exist.

Let $n > 1$, $k \in \mathbb{Z}$, and $c_0 > 0$, define:

$S^n_{k,m_n} = 1 - \frac{1}{\log n}(S_k - S_{m_n})$, (1.23)

$\hat{S}^n_k = \frac{\log(\mathcal{L}(k,n))}{\log n}$, (1.24)

n„ = ^^. (1.25) 
logn 



$S^n_{k,m_n}$ is the function of the potential we want to estimate, $\hat{S}^n_k$ is the estimator and $u_n$ is an error
function.

Now let us define the following random sub-set of $\mathbb{Z}$:

$\mathbb{L}_n^\gamma = \left\{k \in \mathbb{Z},\ \sum_{i=T_{k^*}}^{n} I_{\{X_i = k\}} \ge (\log n)^\gamma\right\}$, (1.26)

recall that $\gamma > 0$. This set $\mathbb{L}_n^\gamma$ is fundamental for our result; we notice that it depends only on the
trajectory of the walk and more especially on its local time: $\mathbb{L}_n^\gamma$ is the set of points for which we are
able to give an estimator of the random potential. We will see that this set is large and contains
a great amount of the points visited by the walk (see Proposition 1.9). We recall that $T_{k^*}$ is the first
time the walk hits a favorite site. In words, $l \in \mathbb{L}_n^\gamma$ if and only if the local time of the random walker
at l after the instant $T_{k^*}$ is large enough (larger than $(\log n)^\gamma$). Our main result is the following:

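Theorem 1.7. Assume 1.2, 1.3 and 1.4 hold, there exist constants $c_0$, $c_1$, $c_2$ and $c_2'$ such that
for all $\gamma > 6$, there exists $n_0$ such that for all $n > n_0$ there exists $G_n \subset \Omega_1$ with $Q[G_n] \ge 1 - \phi_1(n)$
and

$\inf_{\alpha \in G_n} P_0^{\alpha}\left[\bigcap_{k \in \mathbb{L}_n^\gamma} \left\{\left|\hat{S}^n_k - S^n_{k,m_n}\right| < u_n\right\}\right] \ge 1 - \phi_2(n)$, (1.27)

where

$\phi_1(n) = \frac{c_1 \gamma \log_2 n}{\log n}$, (1.28)

$\phi_2(n) = \frac{c_2}{(\log n)^{\gamma/2}} + \frac{c_2'}{(\log n)^{\gamma - 6}}$. (1.29)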



The fact that our result depends on $m_n$ seems to be restrictive: we would like to know where the
bottom of the valley is only by considering the local time of the walk. We prove the following:

Proposition 1.8. Assume 1.2, 1.3 and 1.4 hold, there exists a constant $c_3 > 0$ such that for all $\gamma > 6$,
there exists $n_0$ such that for all $n > n_0$ there exists $G_n \subset \Omega_1$ with $Q[G_n] \ge 1 - \phi_1(n)$ and

$\inf_{\alpha \in G_n} P_0^{\alpha}\left[\max_{x \in \mathbb{F}_n} |m_n - x| \le (\log_2 n)^2\right] \ge 1 - \phi_3(n)$, (1.30)

$\inf_{\alpha \in G_n} P_0^{\alpha}\left[|T_{m_n} - T_{k^*}| \le (\log n)^3\right] \ge 1 - \phi_3(n)$, (1.31)

where $\phi_3(n) = c_3/(\log n)^{\gamma - 6}$.



Notice that the distance between $m_n$ (coordinate of the point visited by the walk where the
minimum of the potential is reached) and a favorite site is negligible compared to a typical fluctuation
of the walk (of order $(\log n)^2$). Thanks to Proposition 1.8 we can replace 1.27 in Theorem 1.7 by

$\inf_{\alpha \in G_n} P_0^{\alpha}\left[\bigcap_{k \in \mathbb{L}_n^\gamma} \left\{\left|\hat{S}^n_k - S^n_{k,k^*}\right| < u_n\right\}\right] \ge 1 - \phi_2(n)$. (1.32)



Now let us give a result giving the main properties of $\mathbb{L}_n^\gamma$.
Proposition 1.9. Assume 1.2, 1.3 and 1.4 hold, there exists a constant $c_3 > 0$ such that for all
$\gamma > 6$, there exists $n_0$ such that for all $n > n_0$ there exists $G_n \subset \Omega_1$ with $Q[G_n] \ge 1 - \phi_1(n)$,

$\inf_{\alpha \in G_n} P_0^{\alpha}\left[\mathcal{L}(\mathbb{L}_n^\gamma, n) = n(1 - o(1))\right] \ge 1 - \phi_2(n)$,
$\inf_{\alpha \in G_n} P_0^{\alpha}\left[|\mathbb{L}_n^\gamma| \approx (\log n)^2\right] \ge 1 - \phi_2(n)$,



(1.33) 
(1.34) 
(1.35) 



where $\lim_{n \to +\infty} o(1) = 0$.

Remark 1.10. By definition we have $\mathbb{F}_n \subset \mathbb{L}_n^\gamma$.

Theorem 1.7 is known as a quenched result, which means that it holds for a fixed environment $\alpha$; a simple
consequence (see Remark 2.4) is the following annealed result:

Corollary 1.11. Assume 1.2, 1.3 and 1.4 hold, there exist three constants $c_0$, $c_1$ and $c_2$ such that
for all $\gamma > 6$, there exists $n_0$ such that for all $n > n_0$

$\mathbb{P}\left[\bigcap_{k \in \mathbb{L}_n^\gamma} \left\{\left|\hat{S}^n_k - S^n_{k,k^*}\right| < u_n\right\}\right] \ge 1 - \phi(n)$, (1.36)

where $\phi(n) = \phi_1(n) + \phi_2(n)$.



We would like to point out that for our purpose the result above is not very interesting, because the
aim is to reconstruct one environment, whereas the result above gives the mean of the probability for
the walk over all the possible environments.

This paper is organized as follows. In Section 2 we give the proof of Theorem 1.7 (we easily get
the corollary from Remark 2.4). We have split this proof into two parts: the first one deals with the
random environment and the other one with the random walk itself. In Section 3 we sketch the proofs
of Propositions 1.8 and 1.9. In Section 4, as an application of our result, we present an algorithm and
some numerical simulations. For completeness, we recall in the appendix some basic facts on birth
and death processes.

2 Proof of Theorem 1.7

The proof of a result with a random environment involves both arguments and properties for the
random environment and arguments for the random walk itself. We start by giving the properties we
need for the random environment. Then we will use them to get the result for the walk.

2.1 Properties needed for the random environment 
2.1.1 Construction of $(G_n, n \in \mathbb{N})$
Let k and l be in $\mathbb{Z}$, define

$E^{\alpha}_k(l) = E^{\alpha}_k[\mathcal{L}(l, T_k)]$, (2.1)

in the same way, let $A \subset \mathbb{Z}$, define

$E^{\alpha}_k(A) = \sum_{l \in A} E^{\alpha}_k[\mathcal{L}(l, T_k)]$. (2.2)

Definition 2.1. Let $d_0 > 0$, $d_1 > 0$, and $\omega \in \Omega_1$, we will say that $\alpha \equiv \alpha(\omega)$ is a good environment
if there exists $n_0$ such that for all $n \ge n_0$ the sequence $(\alpha_i, i \in \mathbb{Z}) = (\alpha_i(\omega), i \in \mathbb{Z})$ satisfies the
properties 2.3 to 2.5

• $\{M_n', m_n, M_n\} \ne \emptyset$, (2.3)

• $M_n' \ge -d_0(\sigma^{-1} \log_2 n \log n)^2$, $M_n \le d_0(\sigma^{-1} \log_2 n \log n)^2$, (2.4)

• $E^{\alpha}_{m_n}(W_n) \le d_1(\log_2 n)^2$, (2.5)

where $W_n = \{M_n', M_n' + 1, \cdots, m_n, \cdots, M_n\}$.

Remark 2.2. We will see in Section 2 that we use some results of Andreoletti (2006). Considering
this, we need extra properties on the random environment in addition to the three mentioned above,
but as we do not need them for our computations we do not make them appear.

Define the set of good environments

$G_n \equiv G_n(d_0, d_1) = \{\omega \in \Omega_1,\ \alpha(\omega) \text{ is a good environment}\}$. (2.6)
$G_n$ depends on $d_0$, $d_1$ and n, however we only make explicit the n dependence.

Proposition 2.3. There exist two constants $d_0 > 0$ and $d_1 > 0$ such that if 1.2, 1.3 and 1.4 hold,
there exists $n_0$ such that for $n > n_0$

$Q[G_n] \ge 1 - \phi_1(n)$, (2.7)

where $\phi_1(n)$ is given by 1.28.
Proof.

We can find the proof for the three properties 2.3 to 2.5 in Andreoletti (2006), see Definition 4.1
and Proposition 4.2. ■

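To end the section we would like to make the following elementary remark on the decomposition of $\mathbb{P}$:
Remark 2.4. Let $C_n \in \sigma(X_i, i \le n)$ and $G_n \subset \Omega_1$, we have:

$\mathbb{P}[C_n] \equiv \int_{\Omega_1} Q(d\omega) \int_{C_n} dP^{\alpha(\omega)}$ (2.8)

$\ge \int_{G_n} Q(d\omega) \int_{C_n} dP^{\alpha(\omega)}$. (2.9)

So assume that $Q[G_n] \equiv e_1(n) \ge 1 - \phi_1(n)$ and assume that for all $\omega \in G_n$, $\int_{C_n} dP^{\alpha(\omega)} \equiv e_2(\omega, n) \ge 1 - \phi_2(n)$; we get that

$\mathbb{P}[C_n] \ge e_1(n) \times \min_{\omega \in G_n}(e_2(\omega, n)) \ge 1 - \phi_1(n) - \phi_2(n)$. (2.10)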
2.2 Arguments for the walk

Let $(\rho_1(n), n \in \mathbb{N})$ be a strictly positive decreasing sequence such that $\lim_{n \to +\infty} \rho_1(n) = 0$. First let us
show that Theorem 1.7 is a simple consequence of the following

Proposition 2.5. Assume 1.2, 1.3 and 1.4 hold, there exists $n_0$ such that for all $n > n_0$ there exists
$G_n \subset \Omega_1$ with $Q[G_n] \ge 1 - \phi_1(n)$ and



sup <; 

«eG„ 



where $w_{k,n} = \rho_1(n) \frac{E^{\alpha}_{m_n}(k)}{E^{\alpha}_{m_n}(W_n)}$, $\phi_1(n)$ and $\phi_2(n)$ are given just after 1.27.

Taking the logarithm and for n large enough, using a Taylor series expansion, we remark that
#m (k) , , \ \ C(k,n) E« (k) , , . . 



U 



£(k,n) E° n (k) 



n 



< Wk, 



> 1 - <h(n) 



(2.11) 



(2.12) 



implies

$-2\rho_1(n) - \log(E^{\alpha}_{m_n}(W_n)) \le \log \mathcal{L}(k,n) - \log n - \log(E^{\alpha}_{m_n}(k)) \le -\log(E^{\alpha}_{m_n}(W_n)) + \rho_1(n)$;
rearranging the terms and using A.1 (see the Appendix) we get



^-(K(k) - 2 Pl (n)) < S% - Sl mn < ^—{R a n (k) - Pl {n)) 
logn ' logn 



(2.13) 



where $R^{\alpha}_n(k) = \log\left(\frac{\alpha_{m_n}}{\beta_k} a_{k,m_n}\right) - \log(E^{\alpha}_{m_n}(W_n))$ and $a_{k,m_n}$ is given by A.2. Now using A.4 and
Property 2.5 we get the Theorem. The proof of Proposition 2.5 is based on the following results



(Lemma ESI of [Andreoletti(2005)| , 



2.2.1 Known facts 

Let $(\rho(n), n \in \mathbb{N})$ be a positive decreasing sequence such that $\lim_{n \to \infty} \rho(n) = 0$, we define



At 



C(m n ,n) 



1 



7? 



> 



p(n) 



A 2 = {T mn < nj '(log n)\C{W n ,n) = l} . 



(2.14) 
(2.15) 



Lemma 2.6. Assume 1.2, 1.3 and 1.4 hold, there exists a constant $b_1 > 0$ such that for all $\gamma > 6$,
there exists $n_0$ such that for all $n > n_0$ there exists $G_n \subset \Omega_1$ with $Q[G_n] \ge 1 - \phi_1(n)$ and



sup {PS^lHnW, 

"EG' 



(2.16) 



where $r_1(n) = b_1/(\log n)^{\gamma - 6}$.
Proof.

We do not give the details of the computations because the reader can find them in the referenced paper
(Theorem 3.8 of Andreoletti (2006)); just notice that, compared to Theorem 3.8, we have a better
rate of convergence for the probability, obtained just by using a weaker result for the concentration of
the walk. ■

We will also need the following elementary fact:

Lemma 2.7. Assume 1.2, 1.3 and 1.4 hold, there exists a constant $b_2 > 0$ such that for all $\gamma > 2$,
there exists $n_0$ such that for all $n > n_0$ there exists $G_n \subset \Omega_1$ with $Q[G_n] \ge 1 - \phi_1(n)$ and

sup {F5[A 2 ]}<r 2 (n), (2.17) 

aeG' n 

where $r_2(n) = b_2/(\log n)^{\gamma - 2}$.
Proof.

Once again this can be found in Andreoletti (2006): Proposition 4.7 and Lemma 4.8. ■

Using these results we can give the proof of Proposition 2.5 in two steps:
2.2.2 Step 1 

Let us define the following subsets : 

V\ = {M' n < k<m n -l, ( max S, - S^) <logn- Jlog 2 n}, (2.18) 

k<j<m n Z 

v$ = {m n + l<k<M n , ( max Sj - S mn ) <logn- Jlog 2 n}, (2.19) 

m n <j<k Z 

and 



l# = «?nt#. (2.20) 

In words $V_n^\gamma$ is a subset of points included in $W_n$, such that for all $k \in V_n^\gamma$ the largest difference of
potential between $m_n$ and k is smaller than $\log n - (\gamma/2) \log_2 n$. For the walk we will see (Lemma below)
that if $k \in V_n^\gamma$ then the walk will hit k after it has reached $m_n$, and it will hit this point k a large enough number
of times (see Figure 5).

First let us prove the following Lemma:

Lemma 2.8. Assume 1.2, 1.3 and 1.4 hold, there exists a constant $b_3 > 0$ such that for all $\gamma > 6$,
there exists $n_0$ such that for all $n > n_0$ there exists $G_n \subset \Omega_1$ with $Q[G_n] \ge 1 - \phi_1(n)$

$\sup_{\alpha \in G_n} \left\{P_0^{\alpha}\left[\mathbb{L}_n^\gamma \subset V_n^\gamma\right]\right\} \ge 1 - r_3(n)$, (2.21)

where $r_3(n) = b_3/(\log n)^{\gamma/2}$.

Notice that $\mathbb{L}_n^\gamma$ is a $\mathbb{P}$ random variable (with two levels of randomness) whereas $V_n^\gamma$ is only a Q random
variable (with one level of randomness); this Lemma makes the link between a trajectory of the walk
and the random environment.



Figure 5: with m n > 0, case 1: Dl = max k&1 max k <j< mn (Sj - S mn ) < logn - 7log 2 n 



Proof. 

To prove this Lemma we use Proposition 1.8. First notice that



»g Kcin] = l-P» 



U e ^nl 



where 



< = {M^ < k < m n - 1, ( max - 5 m J > logn - - log 2 n}, 

k<j<m n Z 

«2 = K + 1 < K M„, ( max 5, - 5 mn ) > logn - - log 2 n}. 

m n <j<k Z 



(2.22) 

(2.23) 
(2.24) 



Let k € let us give an upper bound for 

n[keU, \T k *-T mn \<(\ognf] < 



< 



< 



Y I Xj =k > Qogn)T, \T k »-T mn \ < (log 

j=T k * 



l X,=k > (logn) 7 -(logn) J 



^ I Xj=fc > (logn) 7 -(log 



(2.25) 



for the third inequality we have used the strong Markov property, where 

T, 



M{k > T mnJ _i, X k = m n }, j > 2 
" +00, if such k does not exist. 

T mn ,i=T mn (see [Ol. 
Now using the Markov inequality and Lemma A.1 we get

nW mn [C(k,T mn )\ 



»%[kehl, \T k * -T mn \ < (logn) 3 ] < 



(logn) 7 — (logn)" 



< 



< 



n 



rj exjp(S k - SVn n )( (log n)T - (logn) 3 ) 
1 

r]Q(\ogn)^/ 2 {l — (logn) 3 /(logn) 7 )' 



(2.26) 
(2.27) 
(2.28) 



notice that in the last inequality we have used the fact that k € v™. A similar computation give the 
same inequality when k E v% . Collecting what we did above, and using the Property 12.41 together 
with 1 1.3T1 yields the Lemma. ■ 

2.2.3 Step 2 

This second step is devoted to the proof of the following Lemma. 

Lemma 2.9. For all a and n we have 

£(k,n) E«Jk) 



n 



E«{W n ) 



> W k „, Al, A 2 



< 2exp(-n/2V>2 N) 



(2.29) 



recall that 



E« (*) 



(pi(ra)-p(n)) 2 (a m „A/3 mn ) cxp(-(.S Mfc -S mn )) 



w k ,n - PiWe* {W n ) and V2V 1 ) ~ z iq^H) [F^TI 



(\y ) ™ ■ M k is such 



that S Mk = max^+i^jxfe Sj if k > m n and conversly if k < m n S Mk = m®&k<j<m n -l Sj- 
Proof. 

We essentially use a concentration inequality (see Ledoux (2001)); for simplicity we only give the
proof for $k > m_n$, the other case ($k < m_n$) is very similar. Using the Markov property and the fact
that $\mathcal{L}(k, T_{m_n}) = 0$, we get



C(k,n) E« n (k) 



n 



We have 



C(k,n) E«(k) 



> W k ,n,Al,A2 



< 



C(k,n) E« n (k) 



n 



> W k ,n,Ai 



(2.30) 



< 



< 



E^JWn) 
C(k,n) E- n (k) 



> Wk,n,Al 

C(m n ,n) 



E« n (W n , 
£(^)7m„,ni) E^ n (k) 



> Wk,n, 



< 



p(n) 



n 



E« n {W n ) ~ E« n (W n ) 



n E«jW n 
£(k,T mnni ) E mn (k) 



v 



E%(W n 



■ > w k>n 

■(l + p(n))>w' k . 



(2.31) 
(2.32) 
(2.33) 



where n\ 



(1 + p{n)), notice that m is not necessarily an integer but for simplicity we disre- 



E a (k) * 

gard that, and w' kn = e™\w ) ~~ P( n ))- The strong Markov property implies that C(k, T mnjni ] 

is a sum of n\ i.i.d. random variables, the inequality of concentration gives 



£(^> 2~m n ,ni) Em n (k) 



■n 



E«(W n ) 



> w 'k,mAi 



< exp 



n KjWn) K,n) 2 
2Var m „(£(&,T m J)l + p(n) 



.(2.34) 
With the same method we also get

£>(k, T mnini ) E mn (k) 



■ m n 



n 



Using A.3 we get the Lemma. ■



2.2.4 End of the proof of the Theorem 

Using Lemmata 2.6, 2.7 and 2.8 we have:



< exp 



n 



k,n> 



2Var mn (£(fc,r m J)l + p(n) 



(2.35) 



U 

fceLj 



C(k,n) EZjk) 



< IK 7 I sup P£ 



C(k,n) E«Jk) 



n 



> w k „,Ai,A 2 



+ 3 max {ri(n)\. 

KK3 



then using Lemma 2.9 we get

C(k,n) EZ n (k) 



sup 



n 



< 2 sup exp(— n/2^2 {k, n)) 



< 2 exp(-(log n) 7/2 ~ 2 /(Pi W log 2 ")), (2-36) 

where the last inequality comes from the definition of V2 (see I2.20D and the Properties 12.41 and 12.51 
To end we use again the Property 12.41 together with the definition of Vn ■ 

3 Proof of Propositions 1.8 and 1.9

Sketch of the proof of Proposition 1.8. 1.30 is an improvement of the proof of Corollary 3.17 of
Andreoletti (2006) in order to get a better rate of convergence for the probability. To get 1.31
we have used the same idea as in the proof of Corollary 3.17 of Andreoletti (2006), so once again we
will not repeat the computations here. We recall just the intuitive idea: once the walk has reached
$k^*$, we know from 1.30 that $m_n$ is at most at a distance $(\log_2 n)^2$, therefore the walk needs at most
the amount of time $\exp(\sqrt{(\log_2 n)^2}) = \log n$ to reach $m_n$. We take $(\log n)^3$ to get a better rate of
convergence for the probability.

Sketch of the proof of Proposition 1.9. The first two properties can be deduced from the following
inequality: let $\epsilon > 1$, for all n large enough and all $\alpha \in G_n$:



V 2 ^ +e) Qhl] > ct> 3 {n) +n(n) + cre(logt) 2 exp(-(logn) 7+e - 2 (l - (logn) 1 ^)) . (3.1) 



Indeed, thanks to 3.1 we have

P° [C(hl,n) >n(l-o(l))] >P Q \c{V^ +e \n) > n(l - o(l)) 



(3.2) 



we get 1.33 by using the same method Andreoletti (2006) uses to get Theorem 3.1. To get 1.34 we
only need to show that $|V_n^{\gamma+\epsilon}| \approx (\log n)^2$, which is a basic fact for a simple random walk. Now, to get
3.1, first we notice that, by using a method similar to the proof of Theorem 1.7, we can get



V^^l <|K7 +e | max K 



keV, 



2(7+e) 



< (log 

3=1 



11 



fain) + r\{n) 



(3.3) 



where (rjj,j) is a i.i.d. sequence with the law of C(k, T mji ). Then using an inequality of concentration, 
we get 13 11 



4 Algorithm and Numerical simulations 

4.1 General and recall of the main definitions 

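First notice that we have no criterion to determine whether or not we can apply this method to an
unknown series of data. All we know is that it works for Sinai's walk; however, we can apply the
following algorithm to every process. Let us recall the basic random variables that will be used for
our simulations: let $x \in \mathbb{Z}$, $n \in \mathbb{N}$,

$T_x = \inf\{k \in \mathbb{N}^*,\ X_k = x\}$, and $T_x = +\infty$ if such k does not exist,

$\mathcal{L}(x,n) \equiv \sum_{i=1}^{n} I_{\{X_i = x\}}$, (4.2)

$\mathcal{L}^*(n) = \max_{k \in \mathbb{Z}} (\mathcal{L}(k,n))$, $\mathbb{F}_n = \{k \in \mathbb{Z},\ \mathcal{L}(k,n) = \mathcal{L}^*(n)\}$, (4.3)

$k^* = \inf\{|k|,\ k \in \mathbb{F}_n\}$. (4.4)

We recall also the set $\mathbb{L}_n^\gamma$, the function of the potential we want to estimate and its estimator:

$\mathbb{L}_n^\gamma = \left\{k \in \mathbb{Z},\ \sum_{i=T_{k^*}}^{n} I_{\{X_i = k\}} \ge (\log n)^\gamma\right\}$, (4.6)

$S^n_{k,m_n} = 1 - \frac{1}{\log n}(S_k - S_{m_n})$,

$\hat{S}^n_k = \frac{\log(\mathcal{L}(k,n))}{\log n}$. (4.8)

We also recall that, thanks to Proposition 1.8, in probability we have $|m_n - k^*| \le \mathrm{const} \times (\log_2 n)^2$.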



4.2 Main steps of the algorithm 

Step 1: We have to determine $\mathbb{L}_n^\gamma$, and to get it we have to compute $T_{k^*}$ and therefore the local time of
the process. First we compute $\mathcal{L}(k,n)$ for every k; notice that $\mathcal{L}(k,n)$ is not equal to zero only if k has
been visited by the walk within the interval of time [1, n]. Then we can compute $\mathcal{L}^*(n)$ and determine
$k^*$ and $T_{k^*}$. Notice that $T_{k^*}$ is not a stopping time, therefore we need two passes to compute what we
need. We are now able to determine $\mathbb{L}_n^\gamma$ by computing $\sum_{i=T_{k^*}}^{n} I_{\{X_i = k\}}$.

Step 2: We can check that $\mathbb{L}_n^\gamma$ is connected, contains $k^*$ and that its size is of the order of a typical
fluctuation of the walk. Now, keeping only the k that belong to $\mathbb{L}_n^\gamma$, we compute for those k: $\hat{S}^n_k = \frac{\log(\mathcal{L}(k,n))}{\log n}$, the estimator of the potential. We localize the bottom of the valley $m_n$ using $k^*$.
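The two steps above can be sketched as follows. This is a minimal illustration and not the program used for the simulations of Section 4.3: the function names are hypothetical, the trajectory is assumed to be a list `path` with `path[i]` equal to $X_i$ and $n \ge 2$, and ties between favorite sites of equal absolute value are broken towards the negative side (an assumption). The helper `potential_differences` simply inverts 1.23 and 1.24, which gives $\log n - \log \mathcal{L}(k,n)$ as the estimate of $S_k - S_{m_n}$.

```python
import math
from collections import Counter


def estimate_potential(path, gamma):
    """Return (k*, T_{k*}, L_n^gamma, {k: S-hat_k}) from one trajectory."""
    n = len(path) - 1
    # Pass 1: local times L(k, n) over times 1..n, then the favorite site k*.
    counts = Counter(path[1:])
    l_star = max(counts.values())
    k_star = min((k for k, v in counts.items() if v == l_star),
                 key=lambda k: (abs(k), k))
    t_star = path.index(k_star, 1)      # T_{k*}: first time i >= 1 with X_i = k*
    # Pass 2: visits between T_{k*} and n, compared with (log n)^gamma.
    after = Counter(path[t_star:])
    threshold = math.log(n) ** gamma
    sites = sorted(k for k, v in after.items() if v >= threshold)
    s_hat = {k: math.log(counts[k]) / math.log(n) for k in sites}
    return k_star, t_star, sites, s_hat


def potential_differences(s_hat, n):
    """Estimated S_k - S_{m_n}, i.e. log(n) * (1 - S-hat_k) for each kept site."""
    return {k: math.log(n) * (1.0 - v) for k, v in s_hat.items()}
```

Two passes over the trajectory are needed, as noted in Step 1, because $k^*$ is only known once the whole path has been read.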

4.3 Simulations 

For the first simulation (Figure 6) we show a case where $\mathbb{L}_n^\gamma$ is large, i.e. $\mathbb{L}_n^\gamma$ contains most of the points
visited by the walk. The trajectory of the random potential is in red, the interval of confidence in
blue and green. We took n = 500000 and $\gamma = 7$; notice that the larger $\gamma$ is, the smaller $\mathbb{L}_n^\gamma$ is, but the
better the rate of convergence of the probability is. We get that $\mathbb{L}_n^\gamma = [10, 94]$.

Estimation of the potential using the local time

Figure 6: in red $S^n_{k,m_n}$, in blue $\hat{S}^n_k - u_n$, in green $\hat{S}^n_k + u_n$

In Figure 7 we plot the difference $S^n_{x,m_n} - \hat{S}^n_x$ and its linear regression. We notice that the slope of the linear regression is



Difference between the potential and the logarithm of the local time, and its linear regression (horizontal axis: space x)

Figure 7: in magenta $S^n_{x,m_n} - \hat{S}^n_x$, in red the linear regression



of order 10 5 . We also notice that we have taken n = 500000, so the error function u„ ~ 'f g3 - ~ 0, 7 

' " log n ' 

this match with the max a: (5" mn — S™) sa 0.8 for this simulation. 
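The regression line of Figure 7 can be reproduced with an ordinary least-squares fit. The sketch below is an illustration only, with hypothetical names; it assumes the true potential `S` is known (which is the case in a simulation, not for real data), that `m` is the bottom of the valley $m_n$, and that `s_hat` maps each site of $\mathbb{L}_n^\gamma$ to its estimator $\hat{S}^n_x$.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns (slope, intercept)."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_fit(S, s_hat, m, n):
    """Fit a line to x -> S^n_{x,m} - S-hat_x over the sites kept in s_hat."""
    xs = sorted(s_hat)
    diffs = [1.0 - (S[x] - S[m]) / math.log(n) - s_hat[x] for x in xs]
    return linear_fit(xs, diffs)
```

A slope close to zero indicates that the error of the estimator has no systematic drift across the interval.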
Now let us choose another example where $\mathbb{L}_n^\gamma$ is much smaller. For the following simulation
(Figure 8) we have only changed the sequence of random numbers. We get that $\mathbb{L}_n^\gamma = [-150, -85]$.

Figure 8: in red $S^n_{k,m_n}$, in blue $\hat{S}^n_k - u_n$, in green $\hat{S}^n_k + u_n$

We notice that for the coordinates larger than -85, and especially after -40, our estimator is not good at
all. In fact once the walk has reached the minimum of the valley (coordinate -111) it will never reach
again one of the points of coordinate larger than -40 before n = 500000, so our estimator cannot say
anything about the difference $S^n_{k,m_n} - \hat{S}^n_k$. However if we look in the past of the walk, and especially
at the first time it has reached the coordinate -111, the favorite point for
this time is localized around the point -2, so a good estimator between the coordinates -40 and 10
may be given by ( log ;fg j-T , k). The difference S^ mn — and the linear regression in the interval 
hZ = [—150, —85] is presented Figure M 



Difference between the potential and the logarithm of the local time, and its linear regression 




Figure 9: in magenta S™ mn — S™, in red the linear regression 
A Basic results for birth and death processes

For completeness we recall an explicit expression for the mean and an upper bound for the variance
of the local times at a certain stopping time; a proof of these elementary facts can be found in
Revesz (1989) (page 279).

Lemma A.1. For all $\alpha$, let $k > m_n$

E^ n [C(k,T m J] = ^-^-L-a k , mn , where (A.l) 

Pk C " m n 

Yli-L +1 eSl + eSk 

a k,m n = c ■ (^- 2 ) 

Var m JC(k,T mn )} < 2(E" J£(k, T m J]) 2 \k - m n \. (A.3) 

Pk 

$M_k$ is such that $S_{M_k} = \max_{m_n + 1 \le j \le k - 1} S_j$. For Q-a.a. environment $\alpha$

$\frac{\eta_0}{1 - \eta_0} \le \frac{\alpha_{m_n}}{\beta_k} a_{k,m_n} \le \frac{1}{\eta_0}$. (A.4)

A similar result is true for $k < m_n$, and $E^{\alpha}_{m_n}[\mathcal{L}(m_n, T_{m_n})] = 1$.